1.  Introduction.

As we know from the F-test in the analysis of variance, we usually
compare the observed F-value, which is computed from the data, with the
theoretical F-value in the conventional F table. The theoretical F-values
in the table are taken from the F-distributions with the corresponding
degrees of freedom and the specified probabilities. The F-distribution
is obtained under the condition that the population is normal, so the
population from which the observed F-value is obtained should
also be normal. This is the assumption we need in the analysis of variance.

In many situations the population of the data may not meet this condition.
If the shape of the population distribution function is known,
then we can use the proper transformation to make the data satisfy this
essential condition. Otherwise, many nonparametric methods can be used.
This report deals mainly with the chi-square test in the randomized
complete block design case. A large sample is necessary for using this
method, and the minimum sample size can be determined by a working rule stated
in Section 2.

In the second section we state the difference between a two-way randomized
complete block arrangement table and a two-way contingency table, and
show how the binomial transformation using the pooled median changes the
former into the latter.

The third section discusses the test of independence between two
attributes in the χ²-test, which is comparable to testing the interaction of two
attributes in the analysis of variance.

The fourth and fifth sections deal with the methods of computing the various
χ²'s for different types of experimental data, in which, of
course, the contingency table should be formed first.

The sixth section contains the concepts about the expected frequencies
of the χ²-test. In the seventh section appears a normal score transformation.
This was introduced by Fisher and Yates (1943) and is used for
ranked data. If we first transform the quantitative data into ranks,
numerical data can also be analyzed by this method. The last two
sections compare the χ²-test and the F-test. The F-test is better
for normal populations, and the χ²-test needs larger samples to have the
same power as the F-test. Sheffield and McNemar commented on Wilson's
χ²-test, indicating that the χ²-test has less power than the
F-test.

It is true that if the population of the data is normal, the F-test
is better than any other method; if the data are not drawn from
a normal population, the F-test is no longer the best choice. The
nonparametric methods are like loose suits made to cover most
people without giving any of them a good fit. A transformation, used in
statistical methods to bring the data to a normal distribution so that
the test assumption is met, is like changing people's weight to fit them
into the proper suits. Whichever of these methods is used, we certainly
have something to gain and also something to lose. Therefore, finding
the proper method of analysis for each kind of population is
the best way to do our job.


2.  Randomized  Complete  Block  Arrangement  and  the  Contingency  Table. 

2.1  The  Difference  Between  the  Randomized  Complete  Block  Two-Way  Table 
and  the  Contingency  Table. 

The data of a randomized complete block design are generally in a two-way
classification with one observation in each cell or plot. The
observations in the cells are usually numerical measurements.

This design is devised to compare t treatments in n plots, with each
treatment replicated in b plots, so that bt equals n. The n plots are
divided into b blocks, such that within any block the plots are as
homogeneous as possible, and the variation among blocks is known. The t
treatments are randomly allocated to the t plots in each block. With b
replications, we require b separate randomizations. A two-way
classification table of such an arrangement for a randomized complete block design
is given in Table 2.2.1.

If the observations in the two-way table of the randomized complete
block design are replaced by frequencies, that table becomes a two-way
frequency table. The treatment and block are the two classifying attributes.
This is generally called a two-way r x c contingency table (Table 2.2.2).

2.2  The  Change  from  Randomized  Complete  Block  Two-Way  Table  into  a 
Contingency  Table. 

A binomial transformation can be used to change the randomized complete
block two-way table into a contingency table. Table 2.2.3, for example, is
the transformed form of Table 2.2.1. The method of transformation is
first to find the median of each block and then to replace each observation
above its respective block median by 1 and each observation below or equal to its block


Table 2.2.1
Two-way Classification Table of RCB Design

where $y_{ij}$ is the observation of the $i$th treatment and the $j$th block,
      $y_{i.}$ is the sum of the $i$th treatment,
      $\bar{y}_{i.}$ is the mean of the $i$th treatment,
      $y_{.j}$ is the sum of the $j$th block,
      $\bar{y}_{.j}$ is the mean of the $j$th block,
      $y_{..}$ is the grand total of all n observations,
      $\bar{y}_{..}$ is the grand mean of all n observations.


Table 2.2.2
r x c Contingency Table

where $n_{ij}$ is the number of observations of the $i$th treatment and
               the $j$th block,
      $n$ is the total frequency, or the total number of
               observations in the design,
      $n_{i.}$ is the sum of frequencies of the $i$th treatment,
      $n_{.j}$ is the sum of frequencies of the $j$th block.


median by 0. The number of 1's for each treatment is then considered to be
the frequency of successes, $af_i$, and the number of 0's the frequency
of failures, $bf_i$. Such a 2 x r contingency table is obtained from
the data of a randomized complete block design as in Table 2.2.3.

Table 2.2.3.
2 x r Contingency Table where 'a' Means Above The Median
and 'b' Means Below or Equal to The Median

where 'a' means above the median,
      'b' means below or equal to the median,
      $af_i$ is the number of observations above the median in the
               $i$th treatment,
      $bf_i$ is the number of observations below or equal to the
               median in the $i$th treatment,
      $n_a$ is the total number of observations above the median,
      $n_b$ is the total number of observations below or equal to
               the median,
      $n$ is the total number of all observations.


In the two-way table of a randomized complete block design, if the
number of observations in each cell is more than one, then the method of
transformation is slightly different from the preceding one. That is,
the median used here is the pooled median, Md, which is obtained from all
the n observations instead of from each block. The number of successes and
failures for each treatment is determined by counting the number of
observations above and the number below or equal to the pooled median, Md.

The binomial transformation for a randomized complete block experiment
can be used only if both t and b are large enough to make all the
expected frequencies greater than or equal to 5. That is, in binomial
populations both np and n(1-p), or nq, should be greater than or equal to 5.
This is a working rule for making the transformation effectively.
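
The transformation for one observation per cell can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `median_transform` and the small data set are hypothetical, not part of the report, and the data are far too small to satisfy the working rule above. The sketch shows the counting step only.

```python
from statistics import median

def median_transform(table):
    """table[i][j] is the observation for treatment i in block j
    (one observation per cell). Returns (af, bf): for each treatment,
    the number of observations above, and below or equal to, the
    median of their own block."""
    r = len(table)        # number of treatments
    b = len(table[0])     # number of blocks
    af = [0] * r
    bf = [0] * r
    for j in range(b):
        block = [table[i][j] for i in range(r)]
        md = median(block)              # block median
        for i in range(r):
            if table[i][j] > md:
                af[i] += 1              # coded 1, a "success"
            else:
                bf[i] += 1              # coded 0, a "failure"
    return af, bf

# Hypothetical data: 3 treatments (rows) by 4 blocks (columns).
data = [
    [10, 12, 9, 14],
    [7, 8, 6, 9],
    [8, 13, 7, 15],
]
af, bf = median_transform(data)
print(af, bf)
```

The two returned lists are the 'a' and 'b' rows of a 2 x r table in the sense of Table 2.2.3.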

3.   Interaction  and  Independence. 

An r x c two-way contingency table is usually constructed for the
purpose of studying the relationship between two attributes. In particular,
we may wish to test whether the two attributes are related, that is, dependent.
If the two attributes are not related to each other, they are
independent. On the other hand, if the two-way table contains numerical
measurement data, independence indicates no interaction between these two
attributes. Thus we test interaction between two attributes in numerical
measurement data in the same sense as we test independence between two
attributes in an r x c contingency table. The following simple 2 x 2 table
of artificial data is a numerical example to illustrate no interaction
between two attributes, A and B.


Table 3.1
A 2 x 2 Table of Artificial Data

                    B
   A           1        2       Total
   1          10       12        22
   2          13       15        28
   Total      23       27        50

To see this, we could check that 10 - 12 = 13 - 15 and 10 - 13 = 12 - 15.
This means that the difference between the observations corresponding to
the two levels of A is the same for all levels of B, and the difference
between the observations for the two levels of B is the same for all levels
of A. Hence there is no interaction between the two attributes
A and B. On the other hand, if we consider a 2 x 2 contingency table and
let A be the variate, in which A1 is "success" and A2 is "failure", then
the data take a binomial form, so that B1 and B2 are two binomial samples.
Now we see that the two relative frequencies, or two binomial sample means,
are approximately equal: 10/23 ≈ 12/27 ≈ 22/50 = 0.44. Therefore, we
would say that the two attributes A and B are independent, or that the two
binomial sample means are approximately equal. The reason that they are not
exactly equal is accounted for by the sampling variation. Nevertheless,
from this point of view, we know that the purpose of testing the hypothesis
of interaction for numerical two-way data and that of independence for
two-way contingency data is the same.
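
Both readings of Table 3.1 can be checked numerically. The sketch below is only an illustration; the variable names are assumptions introduced here. It verifies that the row and column differences are constant (the "no interaction" reading) and computes the usual chi-square of independence, with expected frequencies taken as (row total)(column total)/n, for the frequency reading.

```python
table = [[10, 12],    # A1 ("success") in B1, B2
         [13, 15]]    # A2 ("failure") in B1, B2

# Measurement reading: differences between levels are constant.
row_diffs = [table[0][j] - table[1][j] for j in range(2)]   # A1 - A2 at each B
col_diffs = [table[i][0] - table[i][1] for i in range(2)]   # B1 - B2 at each A

# Frequency reading: proportion of successes in each binomial sample.
row_totals = [sum(row) for row in table]
col_totals = [table[0][j] + table[1][j] for j in range(2)]
n = sum(row_totals)
proportions = [table[0][j] / col_totals[j] for j in range(2)]
pooled = row_totals[0] / n

# Chi-square of independence for the 2 x 2 frequency table.
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (table[i][j] - expected) ** 2 / expected

print(row_diffs, col_diffs, proportions, pooled, chi2)
```

The chi-square comes out very close to zero, which is the frequency-table counterpart of the exactly constant differences.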


4.  The χ²-Test for Randomized Complete Block Designs.


4.1  One Observation Per Cell.

The chi-square test in nonparametric methods may be used in many
ways like the analysis of variance in parametric methods, to test the
hypothesis that the r samples are drawn from the same population, or that
the r population means are equal. The difference between them is that the
χ² test deals with multinomial populations, while the analysis of variance
deals with normal populations. Thus, for the nonparametric analysis of
randomized complete block experimental data, we may first transform the
two-way table of numerical observations into a 2 x r two-way frequency
contingency table, which is shown in Table 2.2.3.

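After the 2 x r contingency table is obtained, we can compute the
statistic χ² as follows:

\[
\chi^2 = \sum_{i=1}^{r}\left[
  \frac{\left(af_i - \frac{n_i n_a}{n}\right)^2}{\frac{n_i n_a}{n}}
  + \frac{\left(bf_i - \frac{n_i n_b}{n}\right)^2}{\frac{n_i n_b}{n}}
\right]
= \frac{n^2}{n_a n_b}\left[
  \frac{(af_1)^2}{n_1} + \frac{(af_2)^2}{n_2} + \cdots + \frac{(af_r)^2}{n_r}
  - \frac{(n_a)^2}{n}
\right]
\tag{4.1.1}
\]

which is approximately chi-square with (r - 1) degrees of freedom, where
$n_i = af_i + bf_i$ and all the other notations in formula (4.1.1) are the
same as in Table 2.2.3.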
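
As a rough check of formula (4.1.1), the sketch below computes the statistic in two ways: as the sum over treatments of (observed - expected)²/expected, with expected frequencies $n_i n_a/n$ and $n_i n_b/n$, and through the algebraically equivalent short form $\frac{n^2}{n_a n_b}\left[\sum (af_i)^2/n_i - n_a^2/n\right]$. The function names and the counts are hypothetical and serve only as an illustration.

```python
def chi_square_2xr(af, bf):
    """Chi-square for a 2 x r table of 'above' (af) and
    'below or equal' (bf) counts, one pair per treatment."""
    n_a, n_b = sum(af), sum(bf)
    n = n_a + n_b
    chi2 = 0.0
    for a, b in zip(af, bf):
        n_i = a + b
        e_a = n_i * n_a / n     # expected count above the median
        e_b = n_i * n_b / n     # expected count below or equal
        chi2 += (a - e_a) ** 2 / e_a + (b - e_b) ** 2 / e_b
    return chi2, len(af) - 1    # statistic and degrees of freedom

def chi_square_2xr_short(af, bf):
    """Same statistic through the short computing form."""
    n_a, n_b = sum(af), sum(bf)
    n = n_a + n_b
    s = sum(a * a / (a + b) for a, b in zip(af, bf))
    return n * n / (n_a * n_b) * (s - n_a * n_a / n)

# Hypothetical counts for r = 3 treatments, 20 observations each.
af = [12, 5, 8]
bf = [8, 15, 12]
chi2, df = chi_square_2xr(af, bf)
chi2_short = chi_square_2xr_short(af, bf)
print(chi2, chi2_short, df)
```

The observed value would then be compared with the tabulated χ² for (r - 1) degrees of freedom.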

As we mentioned in the previous section, the binomial transformation
for a randomized complete block experiment in many cases can be used only
if both r and b are large. If r, the number of treatments, is small, the
χ²-value needs to be corrected. The corrected value is given by formula (4.1.2).

This correction term originated from the relation between the
chi-square test of independence and the analysis of variance.


For the case of one observation per cell in a randomized complete block
design, Friedman (1937) suggested a quick method to test the same
hypothesis that the r population means are equal. The observations in
each block are first ranked from 1 to r. Letting $R_{i.}$ be the sum of the
ranks of the observations from the $i$th treatment, we may compute

\[
\chi_r^2 = \frac{12}{br(r+1)} \sum_{i=1}^{r} (R_{i.})^2 - 3b(r+1)
\tag{4.1.1.1}
\]

where b is the number of blocks or replicates,
      r is the number of treatments,
      $R_{i.}$ is the sum of ranks in the $i$th treatment.

Under the null hypothesis, this statistic, $\chi_r^2$, is distributed
approximately as a χ² distribution with (r - 1) degrees of freedom.

The integers 12 and 3 in the formula are constants, not dependent
on the size of the experiment. This approximation is poor for small
values of r and b. Friedman has prepared tables (Siegel 1956) of the
exact distribution of $\chi_r^2$ for some pairs of small values of r and b.
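
Friedman's statistic, 12/(br(r+1)) times the sum of squared treatment rank sums, minus 3b(r+1), is easy to compute directly. The following is a minimal sketch under stated assumptions: the function name and data are hypothetical, the r treatments are ranked within each block, and no ties occur within a block (tied ranks are not handled).

```python
def friedman_chi_square(table):
    """table[i][j] is the observation for treatment i in block j.
    Assumes no ties within a block. Returns (chi2, df, rank_sums)."""
    r = len(table)        # number of treatments
    b = len(table[0])     # number of blocks
    rank_sums = [0] * r
    for j in range(b):
        block = [table[i][j] for i in range(r)]
        # Treatment indices ordered from smallest to largest observation.
        order = sorted(range(r), key=lambda i: block[i])
        for rank, i in enumerate(order, start=1):
            rank_sums[i] += rank
    chi2 = 12.0 / (b * r * (r + 1)) * sum(R * R for R in rank_sums) \
        - 3.0 * b * (r + 1)
    return chi2, r - 1, rank_sums

# Hypothetical data: 3 treatments (rows) by 4 blocks (columns).
data = [
    [10, 12, 9, 14],
    [7, 8, 6, 9],
    [8, 13, 7, 15],
]
chi2, df, rank_sums = friedman_chi_square(data)
print(chi2, df, rank_sums)   # 6.0 2 [10, 4, 10]
```

With r and b this small the chi-square approximation is poor, as noted above, so the example only illustrates the arithmetic.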

4.1.2  Cochran's  Q-Test 

Another method for the same case, contributed by Cochran (Siegel 1956),
is the Q-test. This test is particularly suitable when the data are in a
nominal or dichotomized ordinal scale, such as 'yes' or 'no', 'success' or
'failure', and so on. This test determines whether
the samples come from the same population with respect to the
frequency of successes in the various samples.

The steps for this test are first, in the two-way table, to
assign a '1' to each 'success' and a '0' to each 'failure', and then to
determine the statistic Q by substituting the observed values into the
following formula:


\[
Q = \frac{(r-1)\left[\, r \sum_{i=1}^{r} G_i^2 - \left(\sum_{i=1}^{r} G_i\right)^2 \right]}
         {r \sum_{j=1}^{b} L_j - \sum_{j=1}^{b} L_j^2}
\tag{4.1.2.1}
\]

where $G_i$ is the total number of 'successes' in the $i$th treatment,
      $L_j$ is the total number of 'successes' in the $j$th block,
      r is the number of treatments,
      b is the number of blocks (replications).

Under the hypothesis that the r population means are equal, this Q-value
is distributed approximately as a chi-square distribution with (r - 1)
degrees of freedom.

The significance of the observed value of Q may be determined by
reference to an ordinary χ²-table.
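
A direct computation of Q from a table of 0's and 1's might look as follows. This is a sketch only: the function name and the data are hypothetical, with $G_i$ taken as the treatment totals and $L_j$ as the block totals, as defined above.

```python
def cochran_q(table):
    """table[i][j] is 1 ('success') or 0 ('failure') for treatment i
    in block j. Returns (Q, df)."""
    r = len(table)        # number of treatments
    b = len(table[0])     # number of blocks
    G = [sum(table[i]) for i in range(r)]                          # treatment totals
    L = [sum(table[i][j] for i in range(r)) for j in range(b)]     # block totals
    numerator = (r - 1) * (r * sum(g * g for g in G) - sum(G) ** 2)
    denominator = r * sum(L) - sum(l * l for l in L)
    if denominator == 0:
        # Every block is all successes or all failures: Q is undefined.
        raise ValueError("Q is undefined when no block shows variation")
    return numerator / denominator, r - 1

# Hypothetical data: 3 treatments (rows) by 6 blocks (columns).
data = [
    [1, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1],
]
q, df = cochran_q(data)
print(q, df)   # 4.5 2
```

Blocks in which every treatment has the same outcome contribute nothing to either the numerator or the denominator, so they do not affect Q.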


4.2  More  Observations  Per  Cell 


Suppose that there are r rows, c columns, and h observations per cell.
The observations are denoted by $y_{ijk}$ with i = 1, 2, ..., r; j = 1, 2, ..., c; and
k = 1, 2, ..., h. The two-way table can be transformed into a 2 x r x c
frequency table (Table 4.2.1) by using the pooled median, Md. This
table can also be written as Table 4.2.2.

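From Table 4.2.2 the total χ²-value can be calculated in general as

\[
\chi_T^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\left[
  \frac{\left(af_{ij} - \frac{n_{ij} n_a}{n}\right)^2}{\frac{n_{ij} n_a}{n}}
  + \frac{\left(bf_{ij} - \frac{n_{ij} n_b}{n}\right)^2}{\frac{n_{ij} n_b}{n}}
\right]
\tag{4.2.1}
\]

with (rc - 1) degrees of freedom.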

The hypothesis tested for this case is that the main effects and
interaction effects produce no change in the distribution of the data
population. If the numbers of observations for each cell of the r x c
table, $n_{ij} = af_{ij} + bf_{ij}$, are all equal, and if $n_a = n_b = n/2$, then $\chi_T^2$
can be written as

\[
\chi_T^2 = \frac{4rc}{n} \sum_{i=1}^{r}\sum_{j=1}^{c}
  \left(af_{ij} - \frac{n}{2rc}\right)^2
\tag{4.2.2}
\]

and also if $n_a \neq n_b$, but all $n_{ij}$ are equal, then $\chi_T^2$ can be expressed as

\[
\chi_T^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\left[
  \frac{\left(af_{ij} - \frac{n_a}{rc}\right)^2}{\frac{n_a}{rc}}
  + \frac{\left(bf_{ij} - \frac{n_b}{rc}\right)^2}{\frac{n_b}{rc}}
\right]
\tag{4.2.3}
\]


For computing the row or treatment $\chi_R^2$ and the column or block $\chi_C^2$, we could
change Table 4.2.1 into the form of Table 4.2.3 and Table 4.2.4, namely
2 x r and 2 x c contingency tables respectively; then the two statistics
are in general

Table 4.2.1
2 x r x c Contingency Table, where "a" Means Above and "b"
Means Below or Equal to the Median, Md.

                              Column
Row              1        2      ...     j      ...     c       Total
  1      a     af_11    af_12    ...   af_1j    ...   af_1c     an_1.
         b     bf_11    bf_12    ...   bf_1j    ...   bf_1c     bn_1.
  :
  i      a     af_i1    af_i2    ...   af_ij    ...   af_ic     an_i.
         b     bf_i1    bf_i2    ...   bf_ij    ...   bf_ic     bn_i.
  :
  r      a     af_r1    af_r2    ...   af_rj    ...   af_rc     an_r.
         b     bf_r1    bf_r2    ...   bf_rj    ...   bf_rc     bn_r.
Totals   a     an_.1    an_.2    ...   an_.j    ...   an_.c     n_a
         b     bn_.1    bn_.2    ...   bn_.j    ...   bn_.c     n_b
               n_.1     n_.2     ...   n_.j     ...   n_.c      n
Table 4.2.2
2 x rc Contingency Table

ij       11      12     ...   1c      21     ...   31     ...   r1     ...   rc      Total
a       af_11   af_12   ...  af_1c   af_21   ...  af_31   ...  af_r1   ...  af_rc    n_a
b       bf_11   bf_12   ...  bf_1c   bf_21   ...  bf_31   ...  bf_r1   ...  bf_rc    n_b
Total   n_11    n_12    ...  n_1c    n_21    ...  n_31    ...  n_r1    ...  n_rc     n

Table 4.2.3
2 x r Contingency Table

          1       2      ...     i      ...     r      Total
a        af_1.   af_2.   ...   af_i.    ...   af_r.     n_a
b        bf_1.   bf_2.   ...   bf_i.    ...   bf_r.     n_b
Total    n_1.    n_2.    ...   n_i.     ...   n_r.      n
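Table 4.2.4
2 x c Contingency Table

          1       2      ...     j      ...     c      Total
a        af_.1   af_.2   ...   af_.j    ...   af_.c     n_a
b        bf_.1   bf_.2   ...   bf_.j    ...   bf_.c     n_b
Total    n_.1    n_.2    ...   n_.j     ...   n_.c      n

\[
\chi_R^2 = \sum_{i=1}^{r}\left[
  \frac{\left(af_{i.} - \frac{n_{i.} n_a}{n}\right)^2}{\frac{n_{i.} n_a}{n}}
  + \frac{\left(bf_{i.} - \frac{n_{i.} n_b}{n}\right)^2}{\frac{n_{i.} n_b}{n}}
\right]
\tag{4.2.4}
\]

with (r - 1) degrees of freedom, where $n_{i.} = \sum_{j=1}^{c} n_{ij}$, and

\[
\chi_C^2 = \sum_{j=1}^{c}\left[
  \frac{\left(af_{.j} - \frac{n_{.j} n_a}{n}\right)^2}{\frac{n_{.j} n_a}{n}}
  + \frac{\left(bf_{.j} - \frac{n_{.j} n_b}{n}\right)^2}{\frac{n_{.j} n_b}{n}}
\right]
\tag{4.2.5}
\]

with (c - 1) degrees of freedom, where $n_{.j} = \sum_{i=1}^{r} n_{ij}$.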

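If $n_a = n_b = n/2$, and all $n_{ij}$ are equal, the following two expressions
can be used:

\[
\chi_R^2 = \frac{4r}{n} \sum_{i=1}^{r} \left(af_{i.} - \frac{n}{2r}\right)^2
\tag{4.2.6}
\]

where $af_{i.} = \sum_{j=1}^{c} af_{ij}$;

\[
\chi_C^2 = \frac{4c}{n} \sum_{j=1}^{c} \left(af_{.j} - \frac{n}{2c}\right)^2
\tag{4.2.7}
\]

where $af_{.j} = \sum_{i=1}^{r} af_{ij}$.

Also, if $n_a \neq n_b$ but all $n_{ij}$ are equal, the following two formulas
may be used:

\[
\chi_R^2 = \sum_{i=1}^{r}\left[
  \frac{\left(af_{i.} - \frac{n_a}{r}\right)^2}{\frac{n_a}{r}}
  + \frac{\left(bf_{i.} - \frac{n_b}{r}\right)^2}{\frac{n_b}{r}}
\right]
\tag{4.2.8}
\]

\[
\chi_C^2 = \sum_{j=1}^{c}\left[
  \frac{\left(af_{.j} - \frac{n_a}{c}\right)^2}{\frac{n_a}{c}}
  + \frac{\left(bf_{.j} - \frac{n_b}{c}\right)^2}{\frac{n_b}{c}}
\right]
\tag{4.2.9}
\]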


To detect the interaction effect of row and column we can compute $\chi_I^2$
by subtraction, as is done in the analysis of variance. That is

\[
\chi_I^2 = \chi_T^2 - \chi_R^2 - \chi_C^2
\tag{4.2.10}
\]

with (r - 1)(c - 1) degrees of freedom.

The general expression for $\chi_I^2$ is fairly complex and is given by
Rao (1952).
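
The whole procedure of this section, from the pooled median to the interaction χ² obtained by subtraction, can be sketched as follows. This is an illustration under stated assumptions only: the function names and the data are hypothetical, each χ² is computed from the general form (observed - expected)²/expected with expected frequencies (cell total)(n_a or n_b)/n, and both n_a and n_b are assumed to be positive.

```python
from statistics import median

def chi_square_2xk(af, bf):
    """General chi-square for a 2 x k table of above/below-or-equal counts."""
    n_a, n_b = sum(af), sum(bf)
    n = n_a + n_b
    total = 0.0
    for a, b in zip(af, bf):
        e_a = (a + b) * n_a / n
        e_b = (a + b) * n_b / n
        total += (a - e_a) ** 2 / e_a + (b - e_b) ** 2 / e_b
    return total

def median_partition(y):
    """y[i][j] is the list of observations in row i, column j.
    Returns {name: (chi2, df)} for total, rows, columns, interaction."""
    r, c = len(y), len(y[0])
    md = median([v for row in y for cell in row for v in cell])   # pooled median
    af = [[sum(v > md for v in cell) for cell in row] for row in y]
    bf = [[sum(v <= md for v in cell) for cell in row] for row in y]
    # Total: the 2 x rc table.
    chi_t = chi_square_2xk([a for row in af for a in row],
                           [b for row in bf for b in row])
    # Rows: the 2 x r table of row sums.
    chi_r = chi_square_2xk([sum(row) for row in af], [sum(row) for row in bf])
    # Columns: the 2 x c table of column sums.
    chi_c = chi_square_2xk([sum(af[i][j] for i in range(r)) for j in range(c)],
                           [sum(bf[i][j] for i in range(r)) for j in range(c)])
    chi_i = chi_t - chi_r - chi_c          # interaction by subtraction
    return {
        "T": (chi_t, r * c - 1),
        "R": (chi_r, r - 1),
        "C": (chi_c, c - 1),
        "I": (chi_i, (r - 1) * (c - 1)),
    }

# Hypothetical data: r = 2 rows, c = 2 columns, h = 4 observations per cell.
y = [
    [[1, 2, 3, 9], [10, 11, 12, 4]],
    [[5, 6, 13, 14], [7, 8, 15, 16]],
]
result = median_partition(y)
print(result)
```

The degrees of freedom add up as in the analysis of variance: (r - 1) + (c - 1) + (r - 1)(c - 1) = rc - 1.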


5.  Extension  of  Randomized  Complete  Block  Design. 

5.1  Randomized  Complete  Block  Design  with  Two  Treatments  with  One  Obser- 
vation Per  Cell. 

If only two treatments and b blocks are contained in the experimental
data, the sign test may be used. The computing method for this case is
to give a plus or minus sign to the difference in each of the b blocks,
depending on whether the observation of the first treatment is greater or
less than the observation of the second treatment. If there is no
difference between the two treatments, plus and minus signs occur with equal
probability. If the effect of the first treatment is greater than that
of the second treatment, one can expect an excess of plus signs; otherwise,
a deficit in plus signs. Therefore, the hypothesis that the two treatment
effects are equal is the same as the hypothesis that the probability of a
plus sign is equal to 0.5, or p = 0.5.

Here again, the nonparametric method is essentially the binomial
transformation. To test the hypothesis that p = 0.5, a χ²-test may be used,
provided that the number of blocks is greater than or equal to 10, by
the working rule bp ≥ 5.

Strictly speaking, the sign test is applicable only to the case in
which all the b signs are either positive or negative. But in practice
the two observations of a block are sometimes equal. When this occurs,
such a block may be excluded from the test.

The χ²-value of the sign test is exactly the corrected chi-square
χ² (4.1.2) for the randomized complete block experiment with 2 treatments
and b blocks. This relation can be shown algebraically. The median of
a block is the average of the two observations in that block. A plus sign
implies that the first observation is greater than the second one in that
block. Therefore, the number of observations greater than their block
medians for the first treatment equals the number of observations less than
their block medians for the second treatment. Therefore, the 2 x 2
contingency table is as follows:

                treat 1     treat 2     totals
no. of +'s         T         b - T         b
no. of -'s       b - T         T           b
totals             b           b          2b

The letter T in the above table is the number of plus signs. By the
sign test

\[
\chi^2 = \frac{\left(T - \frac{b}{2}\right)^2}{b(0.5)(0.5)} = \frac{(2T - b)^2}{b}
\tag{5.1.1}
\]

By the method for the randomized complete block experiment and formula (4.1.2)

\[
\chi^2 = \frac{1}{2} \cdot
  \frac{\dfrac{T^2}{b} + \dfrac{(b - T)^2}{b} - \dfrac{b^2}{2b}}
       {\dfrac{1}{2} \cdot \dfrac{1}{2}}
\tag{5.1.2}
\]

which can be reduced to the same expression given in formula (5.1.1).

Other methods of nonparametric analysis for two related samples may
be found in Siegel (1956).
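
The sign-test χ² for two treatments, (2T - b)²/b with T the number of plus signs, can be sketched as below. The function name and the data are hypothetical; blocks whose two observations are equal are dropped before counting, as described in this section.

```python
def sign_test_chi_square(first, second):
    """first[j], second[j] are the two treatment observations in block j.
    Returns (chi2, T, b), where b counts only the untied blocks."""
    diffs = [x - y for x, y in zip(first, second) if x != y]   # drop tied blocks
    b = len(diffs)
    if b == 0:
        raise ValueError("all blocks are tied; the test cannot be computed")
    T = sum(d > 0 for d in diffs)        # number of plus signs
    chi2 = (2 * T - b) ** 2 / b          # equals (T - b/2)^2 / (b * 0.5 * 0.5)
    return chi2, T, b

# Hypothetical data for 12 blocks; the second block is tied and is excluded.
first = [12, 15, 11, 14, 13, 16, 12, 18, 15, 17, 14, 13]
second = [10, 15, 12, 11, 10, 13, 11, 14, 16, 13, 12, 10]
chi2, T, b = sign_test_chi_square(first, second)
print(chi2, T, b)
```

The result is referred to the χ² table with 1 degree of freedom; after excluding ties, the working rule is applied to the remaining number of blocks.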

5.2  Randomized  Complete  Block  Design  with  Two  Factors  and  no  Combination 

If the treatment contains two factors, A and C, both at m levels, and
if there are b blocks in the experiment, the two-way arrangement is
as given in Table 5.2.1.

Table 5.2.1
Arrangement of Two Factors with No Combination in RCB Design

                              Block
Factor   Level      1       2      ...     b       Total
  A        1       a_11    a_12    ...    a_1b     a_1.
           2       a_21    a_22    ...    a_2b     a_2.
           :
           m       a_m1    a_m2    ...    a_mb     a_m.
  C        1       c_11    c_12    ...    c_1b     c_1.
           2       c_21    c_22    ...    c_2b     c_2.
           :
           m       c_m1    c_m2    ...    c_mb     c_m.

For these data we may find the difference between corresponding levels
of factor A and factor C in the b blocks.

To find the interaction between the factors and the blocks, the
method is to tabulate the differences between values at corresponding
levels for these two factors under the blocks. The next step is to
determine the ranks of the differences (Table 5.2.2).

The following χ² value can be used to test the hypothesis that the
two factors have no interaction with blocks:

\[
\chi^2 = \frac{12}{mb(b+1)} \sum_{j=1}^{b} r_j^2 - 3m(b+1)
\tag{5.2.1}
\]

with (b - 1) degrees of freedom, where b is the number of blocks, m is the
number of levels, and $r_j$ is the sum of ranks in the $j$th block.


Table 5.2.2
The Difference and Rank Table of A-C

          Difference             Difference                   Difference
Level     in Block I     Rank    in Block II    Rank   ...    in Block b     Rank
  1       a_11 - c_11            a_12 - c_12           ...    a_1b - c_1b
  2       a_21 - c_21            a_22 - c_22           ...    a_2b - c_2b
  :
  m       a_m1 - c_m1            a_m2 - c_m2           ...    a_mb - c_mb
Sum of ranks             r_1                    r_2    ...                   r_b

The resultant χ²-value can be compared with that of the conventional
χ² table with the respective degrees of freedom.

5.3  Randomized Complete Block Design with Three Factors and No Combination

If three factors, A, B and C, are involved in the treatment for a
randomized complete block design, then $\chi_I^2$ is the sum of two χ²'s. One is
obtained by finding the difference A - B for the different blocks, in the
same manner as shown in Table 5.2.2 in the last section, and the other χ²
is obtained by finding A + B - 2C for all blocks.

5.4  Randomized Complete Block Design with Four Factors and No Combination

In this case, we can use a similar procedure to find three components
of χ². That is, the first χ² is obtained from the difference
A - B, the second χ² from A + B - 2C, and the third χ² from
A + B + C - 3D; $\chi_I^2$ is the sum of the three.

If more than four factors are involved in the treatment with no
combination, the method is the extension of the previous ones.

5.5  Randomized  Complete  Block  Design  with  Two  Factors  and  With  Each  Cell 
Containing  More  Than  One  Observation. 

Suppose the randomized complete block design includes two factors, where the
first factor has r levels and the second factor has c levels. Then there
are rc treatment combinations. Each treatment combination is repeated in
b plots, and each plot contains $n_{ijk}$ observations. Then, by using the
binomial transformation, a 2 x rcb frequency contingency table can be
obtained as in the following table.

Table 5.5.1
2 x rcb Contingency Table, where 'a' and 'b' Mean Above and
Below or Equal to the Median, Md.

          111       112      ...     ijk      ...     rcb      Total
a        af_111    af_112    ...    af_ijk    ...    af_rcb     n_a
b        bf_111    bf_112    ...    bf_ijk    ...    bf_rcb     n_b
Total    n_111     n_112     ...    n_ijk     ...    n_rcb      n

where $af_{ijk}$ is the number of observations in the $ijk$th cell which are greater
than Md,
      $bf_{ijk}$ is the number of observations in the $ijk$th cell which are less
than or equal to Md.
From this table we can compute the total chi-square to test the
hypothesis that the main effects and interaction effects make no difference
in the population distribution of the data. This statistic can be
expressed as

\[
\chi_T^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{b}\left[
  \frac{\left(af_{ijk} - \frac{n_{ijk} n_a}{n}\right)^2}{\frac{n_{ijk} n_a}{n}}
  + \frac{\left(bf_{ijk} - \frac{n_{ijk} n_b}{n}\right)^2}{\frac{n_{ijk} n_b}{n}}
\right]
\tag{5.5.1}
\]

with (rcb - 1) degrees of freedom, where $n_{ijk} = af_{ijk} + bf_{ijk}$.

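Chi-squares for the three main effects, namely the two factor effects
and the block effect, are computed in the same manner:

\[
\chi_R^2 = \sum_{i=1}^{r}\left[
  \frac{\left(af_{i..} - \frac{n_{i..} n_a}{n}\right)^2}{\frac{n_{i..} n_a}{n}}
  + \frac{\left(bf_{i..} - \frac{n_{i..} n_b}{n}\right)^2}{\frac{n_{i..} n_b}{n}}
\right]
\tag{5.5.2}
\]

\[
\chi_C^2 = \sum_{j=1}^{c}\left[
  \frac{\left(af_{.j.} - \frac{n_{.j.} n_a}{n}\right)^2}{\frac{n_{.j.} n_a}{n}}
  + \frac{\left(bf_{.j.} - \frac{n_{.j.} n_b}{n}\right)^2}{\frac{n_{.j.} n_b}{n}}
\right]
\tag{5.5.3}
\]

\[
\chi_B^2 = \sum_{k=1}^{b}\left[
  \frac{\left(af_{..k} - \frac{n_{..k} n_a}{n}\right)^2}{\frac{n_{..k} n_a}{n}}
  + \frac{\left(bf_{..k} - \frac{n_{..k} n_b}{n}\right)^2}{\frac{n_{..k} n_b}{n}}
\right]
\tag{5.5.4}
\]

where $af_{..k} = \sum_{i=1}^{r}\sum_{j=1}^{c} af_{ijk}$ and
$bf_{..k} = \sum_{i=1}^{r}\sum_{j=1}^{c} bf_{ijk}$, with the other marginal
totals defined analogously.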

These three chi-squares, $\chi_R^2$, $\chi_C^2$, and $\chi_B^2$, are distributed as χ²
random variables with (r - 1), (c - 1), and (b - 1) degrees of freedom respectively.

The hypothesis tested is that the population means of different
levels for all three main factors are identical.

The total interaction $\chi_I^2$ can be computed by subtracting from $\chi_T^2$:

\[
\chi_I^2 = \chi_T^2 - \chi_R^2 - \chi_C^2 - \chi_B^2
\tag{5.5.5}
\]

This statistic is distributed approximately as a χ² distribution
with rcb - r - c - b + 2 degrees of freedom.


If $\chi_I^2$ is significant, then we may make 2 x b x c, 2 x r x b, and
2 x r x c contingency tables across rows, columns, and blocks respectively.
For each of these tables we can compute a total chi-square, written as
${}_{RC}\chi_T^2$, ${}_{RB}\chi_T^2$, and ${}_{CB}\chi_T^2$, so
that the interactions for each pair of two main factors are

\[
{}_{RC}\chi_I^2 = {}_{RC}\chi_T^2 - \chi_R^2 - \chi_C^2
\tag{5.5.6}
\]

\[
{}_{RB}\chi_I^2 = {}_{RB}\chi_T^2 - \chi_R^2 - \chi_B^2
\tag{5.5.7}
\]

\[
{}_{CB}\chi_I^2 = {}_{CB}\chi_T^2 - \chi_C^2 - \chi_B^2
\tag{5.5.8}
\]

These three statistics are distributed approximately as χ² distributions
with (r - 1)(c - 1), (r - 1)(b - 1), and (c - 1)(b - 1) degrees of
freedom respectively.

Finally, the triple interaction χ² of row, column, and block is
expressed as

\[
{}_{RBC}\chi_I^2
= \chi_T^2 - \chi_R^2 - \chi_C^2 - \chi_B^2
  - {}_{RC}\chi_I^2 - {}_{RB}\chi_I^2 - {}_{CB}\chi_I^2
= \chi_I^2 - {}_{RC}\chi_I^2 - {}_{RB}\chi_I^2 - {}_{CB}\chi_I^2
\tag{5.5.9}
\]

which is approximately distributed as a χ² random variable with
(r - 1)(c - 1)(b - 1) degrees of freedom.

To test the significance of all the χ² statistics of the main effects
and interactions above, we may compare the observed χ²-values with the
conventional χ² table with the corresponding degrees of freedom.
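
As a consistency check on this partition (a worked identity added here for illustration, not a formula from the original report), the degrees of freedom of the three main effects, the three two-factor interactions, and the triple interaction add up to the total rcb - 1:

\[
\begin{aligned}
&(r-1) + (c-1) + (b-1) \\
&\quad + (r-1)(c-1) + (r-1)(b-1) + (c-1)(b-1) \\
&\quad + (r-1)(c-1)(b-1) \;=\; rcb - 1 ,
\end{aligned}
\]

which follows from expanding $rcb = [1 + (r-1)][1 + (c-1)][1 + (b-1)]$. For example, with r = 3, c = 4, and b = 2, the main effects contribute 2 + 3 + 1 = 6, the two-factor interactions 6 + 2 + 3 = 11, and the triple interaction 6, giving 23 = 24 - 1; the total interaction has 11 + 6 = 17 = rcb - r - c - b + 2 degrees of freedom.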

6.  The  Expected  Frequencies 

6.1  Two-Way  Classification 

6.1.1  'i' and 'j' are Both 'variates'.

In the two-way classification, if we suppose that the row and column
are referred to as treatment and block respectively, the expected frequencies
can be obtained under the hypothesis. If we let $p_{ij}$ be the probability
that an individual selected at random from the population is a member of
the observations in the $i$th row and $j$th column of the r x c contingency
table, let $p_{i.}$ be the probability that an individual is a member of
the $i$th row, and let $p_{.j}$ be the probability that an individual is a
member of the $j$th column (in this case n is fixed from sample to sample),
then an r x c probability table is indicated as the following table,
Table 6.1.1, which is formed from Table 2.2.2.

Table 6.1.1
r x c Probability Table

                          Column
Row         1       2      ...     j      ...     c       Total
 1         p_11    p_12    ...    p_1j    ...    p_1c     p_1.
 2         p_21    p_22    ...    p_2j    ...    p_2c     p_2.
 :
 i         p_i1    p_i2    ...    p_ij    ...    p_ic     p_i.
 :
 r         p_r1    p_r2    ...    p_rj    ...    p_rc     p_r.
Total      p_.1    p_.2    ...    p_.j    ...    p_.c     1

where

\[
p_{ij} = \frac{n_{ij}}{n} = \frac{af_{ij} + bf_{ij}}{n}
\]

The hypothesis that the row and column, or two attributes, are independent
can be written in the form

\[
H_0 : p_{ij} = p_{i.}\, p_{.j}
\]

against

\[
H_a : p_{ij} \neq p_{i.}\, p_{.j}
\]

for some i and j (i = 1, 2, ..., r and j = 1, 2, ..., c).

)le  of  size  n  is  selected  and  n. ,  individuals  of  them  are  in 
the  cell  of  the  i   row  and  j   column,  then  the  chi-square  is  conven- 
tionally computed  as 

r       c   (n   -  np   ; 

x2 «  y    y  — 2J ^—  (6.1.1.1) 

L   L       np 
i=l  j=l       piJ 

with  (r  -  l)(c  -  l)  degrees  of  freedom.   Under  the  hypothesis,  this  expres- 
sion may  be  written  as 

r   c  (n. .  -  np.  p   ) 

x2  =   y     y  — =J x-  •■>      .  (6.1.1.2) 

•   1    4   1  "P-   P   • 

1=1  j=l     *1.  .J 

Since  the  p.   and  p   are  unknown,  it  is  necessary  to  estimate  them 
i.      .J 

from  the  sample. 

By the property of $\chi^2$, the $\chi^2$-test can be used if the estimates are maximum likelihood estimates, with one degree of freedom lost for each parameter estimated. Since $\sum_{i=1}^{r} p_{i.} = 1$ and $\sum_{j=1}^{c} p_{.j} = 1$, there are r - 1 + c - 1 = r + c - 2 parameters to be estimated; hence the proper number of degrees of freedom for testing the independence of two attributes in the $r \times c$ contingency table is df = rc - 1 - (r + c - 2) = (r - 1)(c - 1).

To find the maximum likelihood estimates of the $p_{i.}$ and $p_{.j}$ we let $n_{i.}$ denote the sum of the frequencies in the $i$th row and let $n_{.j}$ denote the sum of the frequencies in the $j$th column. Since the frequencies $n_{ij}$ are discrete, the likelihood function of the sample is the probability of obtaining the sample in the order occurred. Thus, using the same reasoning as that used to arrive at $p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}$, the likelihood function of the sample will be given by

$$L = \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i=1}^{r} \prod_{j=1}^{c} p_{ij}^{n_{ij}}. \tag{6.1.1.3}$$

But because of $H_0: p_{ij} = p_{i.} p_{.j}$ and the definition of $n_{i.}$ and $n_{.j}$, this likelihood function reduces to

$$\begin{aligned}
L &= \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i=1}^{r} \prod_{j=1}^{c} (p_{i.} p_{.j})^{n_{ij}} \\
  &= \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i=1}^{r} \prod_{j=1}^{c} p_{i.}^{n_{ij}} \prod_{i=1}^{r} \prod_{j=1}^{c} p_{.j}^{n_{ij}} \\
  &= \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i=1}^{r} p_{i.}^{\sum_j n_{ij}} \prod_{j=1}^{c} p_{.j}^{\sum_i n_{ij}} \\
  &= \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i=1}^{r} p_{i.}^{n_{i.}} \prod_{j=1}^{c} p_{.j}^{n_{.j}}.
\end{aligned} \tag{6.1.1.3}$$

Now, let $p_{r.} = 1 - \sum_{i=1}^{r-1} p_{i.}$, then

$$L = \frac{n!}{\prod_{i,j} n_{ij}!} \Bigl(1 - \sum_{i=1}^{r-1} p_{i.}\Bigr)^{n_{r.}} \prod_{i=1}^{r-1} p_{i.}^{n_{i.}} \prod_{j=1}^{c} p_{.j}^{n_{.j}} \tag{6.1.1.4}$$

and

$$\log L = n_{r.} \log\Bigl(1 - \sum_{i=1}^{r-1} p_{i.}\Bigr) + \sum_{i=1}^{r-1} n_{i.} \log p_{i.} + K \tag{6.1.1.5}$$

where $K$ does not involve the variable $p_{i.}$.

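Now, differentiating with respect to $p_{i.}$ and setting the derivative equal to zero to find a maximum,

$$\frac{\partial \log L}{\partial p_{i.}} = -\frac{n_{r.}}{1 - \sum_{i=1}^{r-1} p_{i.}} + \frac{n_{i.}}{p_{i.}} = 0. \tag{6.1.1.6}$$

Since $1 - \sum_{i=1}^{r-1} p_{i.} = p_{r.}$, this equation is equivalent to

$$p_{i.} = \frac{p_{r.}}{n_{r.}}\, n_{i.} = \lambda n_{i.}, \tag{6.1.1.7}$$

where $\lambda$ does not depend upon the index $i$. Since this must hold for $i = 1, 2, \ldots, r$ and since

$$1 = \sum_{i=1}^{r} p_{i.} = \lambda \sum_{i=1}^{r} n_{i.} = \lambda n, \tag{6.1.1.8}$$

it follows that $\lambda = 1/n$, and hence that the maximum likelihood estimate of $p_{i.}$ is

$$\hat{p}_{i.} = \frac{n_{i.}}{n}. \tag{6.1.1.9}$$

By symmetry, the maximum likelihood estimate of $p_{.j}$ is

$$\hat{p}_{.j} = \frac{n_{.j}}{n}. \tag{6.1.1.10}$$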

If $p_{i.}$ and $p_{.j}$ in the formula (6.1.1.2) are replaced by their maximum likelihood estimates, the $\chi^2$ will become

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\bigl(n_{ij} - \frac{n_{i.} n_{.j}}{n}\bigr)^2}{\frac{n_{i.} n_{.j}}{n}} \tag{6.1.1.11}$$

with (r - 1)(c - 1) degrees of freedom, but we should notice that this statistic is distributed as a $\chi^2$ distribution provided that $n$ is sufficiently large and $H_0$ is true.
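As an illustration of formula (6.1.1.11), the following minimal Python sketch computes the expected frequencies $n_{i.} n_{.j} / n$ and the resulting $\chi^2$ for an $r \times c$ frequency table. The function name `chi_square_independence` and the 2 x 2 table of counts are hypothetical and chosen only for illustration; they are not taken from the report.

```python
from typing import List, Tuple


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Return (chi-square, degrees of freedom) for an r x c frequency table.

    Expected frequencies are n_i. * n_.j / n, i.e. the cell probabilities
    under independence with the marginal probabilities replaced by their
    maximum likelihood estimates n_i./n and n_.j/n.
    """
    r, c = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(c)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, (r - 1) * (c - 1)


# Hypothetical 2 x 2 table of counts (illustrative only).
counts = [[10, 20], [20, 10]]
chi2, df = chi_square_independence(counts)
print(round(chi2, 4), df)
```

The observed value would then be compared with the tabled $\chi^2$ value for the returned degrees of freedom, keeping in mind that the approximation requires a sufficiently large $n$.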

6.1.2   'i' is a 'Way of Classification' and 'j' is a 'Variate'.

If we consider the row a way of classification, then the $r \times c$ probability table can be changed, so that $\sum_{j=1}^{c} p_{ij} = p_{i.} = 1$ and $n_{i.}$ is fixed. So for such a row the likelihood function is

$$\frac{n_{i.}!}{\prod_{j=1}^{c} n_{ij}!} \prod_{j=1}^{c} p_{ij}^{n_{ij}}. \tag{6.1.2.1}$$

Now we have $r$ independent sets of sizes $n_{1.}, n_{2.}, \ldots, n_{r.}$ of independent observations such that $n_{i.}$ ($i = 1, 2, \ldots, r$) is fixed from sample to sample. Under the hypothesis that $p_{ij}$, for any column, is independent of row, or in other words,

$$H_0: p_{ij} = q_{.j} \text{ (say)}$$

against $H_a \neq H_0$, where the $q_{.j}$'s are arbitrary positive parameters such that $\sum_{j=1}^{c} q_{.j} = \sum_{j=1}^{c} p_{ij} = p_{i.} = 1$, we have, therefore,

$$L = \prod_{i=1}^{r} \Biggl[ \frac{n_{i.}!}{\prod_{j=1}^{c} n_{ij}!} \prod_{j=1}^{c} p_{ij}^{n_{ij}} \Biggr]
   = \prod_{i=1}^{r} \Biggl[ \frac{n_{i.}!}{\prod_{j=1}^{c} n_{ij}!} \prod_{j=1}^{c} q_{.j}^{n_{ij}} \Biggr]
   = \frac{\prod_{i} n_{i.}!}{\prod_{i,j} n_{ij}!} \prod_{j=1}^{c} q_{.j}^{n_{.j}}. \tag{6.1.2.2}$$

Maximizing $\log L$ with respect to the $q_{.j}$'s subject to $\sum_{j=1}^{c} q_{.j} = 1$ we obtain the maximum likelihood solutions: $\hat{q}_{.j} = n_{.j}/n$. The number of independent parameters estimated from the data is c - 1, and hence the test here is a $\chi^2$-test whose number of degrees of freedom is r(c - 1) - (c - 1) = (r - 1)(c - 1) and whose form is

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\bigl(n_{ij} - n_{i.} \frac{n_{.j}}{n}\bigr)^2}{n_{i.} \frac{n_{.j}}{n}}. \tag{6.1.2.3}$$

The result for the case of 'i' being a variate and 'j' a way of classification may be obtained in the same manner as that above.
6.1.3   'i' and 'j' are Both 'Ways of Classification'.

The row and column of the contingency table are both ways of classification. If we suppose $n_{i.}$ and $n_{.j}$ in the $r \times c$ contingency table are both fixed from sample to sample, then both row and column marginal probabilities are all equal to 1, that is

$$\sum_{i=1}^{r} p_{ij} = \sum_{j=1}^{c} p_{ij} = 1 \quad \text{or} \quad p_{.j} = p_{i.} = 1. \tag{6.1.3.1}$$

In this case the chi-square will be

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\bigl(n_{ij} - \frac{n_{i.} n_{.j}}{n}\bigr)^2}{\frac{n_{i.} n_{.j}}{n}} \tag{6.1.3.2}$$

with rc - (r + c - 1) = (r - 1)(c - 1) degrees of freedom.

6.2   Three-Way Classification.

6.2.1   'i', 'j' and 'k' Are All 'Variates'.

Suppose we have a sample of independent observations such that $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell and $n$ is fixed from sample to sample, and if we let

$$\sum_{i=1}^{r} p_{ijk} = p_{.jk}, \qquad \sum_{j=1}^{c} p_{ijk} = p_{i.k}, \qquad \sum_{k=1}^{b} p_{ijk} = p_{ij.},$$

$$\sum_{i,j} p_{ijk} = p_{..k}, \qquad \sum_{i,k} p_{ijk} = p_{.j.}, \qquad \sum_{j,k} p_{ijk} = p_{i..}, \tag{6.2.1.1}$$

$$\sum_{i,j,k} p_{ijk} = p_{...} = 1,$$

then the likelihood function is given by

$$L = \frac{n!}{\prod_{i,j,k} n_{ijk}!} \prod_{i,j,k} p_{ijk}^{n_{ijk}}. \tag{6.2.1.2}$$

Under the hypothesis of independence between 'i' and 'j' for fixed 'k',

$$H_0: \frac{p_{ijk}}{p_{..k}} = \frac{p_{i.k}}{p_{..k}} \cdot \frac{p_{.jk}}{p_{..k}} \quad \text{or} \quad H_0: p_{ijk} = \frac{p_{i.k}\, p_{.jk}}{p_{..k}}$$

against $H_a \neq H_0$ ($i = 1, \ldots, r$; $j = 1, \ldots, c$; $k = 1, \ldots, b$). We then have

$$L \propto \prod_{i,j,k} \Bigl( \frac{p_{i.k}\, p_{.jk}}{p_{..k}} \Bigr)^{n_{ijk}}. \tag{6.2.1.3}$$

Maximizing $\log L$ with respect to the $p_{i.k}$'s, $p_{.jk}$'s and $p_{..k}$'s subject to $\sum_{i=1}^{r} p_{i.k} = \sum_{j=1}^{c} p_{.jk} = p_{..k}$ and $\sum_{k=1}^{b} p_{..k} = 1$ gives the maximum likelihood solutions

$$\hat{p}_{i.k} = \frac{n_{i.k}}{n}, \qquad \hat{p}_{.jk} = \frac{n_{.jk}}{n}, \qquad \hat{p}_{..k} = \frac{n_{..k}}{n}. \tag{6.2.1.4}$$

The number of these estimated parameters is (r - 1)b + (c - 1)b + (b - 1). The $\chi^2$ used to test the hypothesis here is

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{b} \frac{\bigl(n_{ijk} - \frac{n_{i.k}\, n_{.jk}}{n_{..k}}\bigr)^2}{\frac{n_{i.k}\, n_{.jk}}{n_{..k}}} \tag{6.2.1.5}$$

with rcb - 1 - b(r - 1) - b(c - 1) - (b - 1) = b(r - 1)(c - 1) degrees of freedom.
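The statistic (6.2.1.5) amounts to a two-way test of independence computed within each level of 'k' and summed over 'k'. The following Python sketch shows this under the assumption that the counts are stored as a nested list `counts[i][j][k]`; the function name and the 2 x 2 x 2 counts are hypothetical and serve only as an illustration.

```python
def chi_square_conditional(counts):
    """Chi-square for independence of 'i' and 'j' within each level of 'k'.

    counts[i][j][k] holds n_ijk. The expected frequency of a cell is
    n_i.k * n_.jk / n_..k, and the degrees of freedom are b(r-1)(c-1).
    """
    r, c, b = len(counts), len(counts[0]), len(counts[0][0])
    chi2 = 0.0
    for k in range(b):
        n_ik = [sum(counts[i][j][k] for j in range(c)) for i in range(r)]
        n_jk = [sum(counts[i][j][k] for i in range(r)) for j in range(c)]
        n_k = sum(n_ik)
        for i in range(r):
            for j in range(c):
                expected = n_ik[i] * n_jk[j] / n_k
                chi2 += (counts[i][j][k] - expected) ** 2 / expected
    return chi2, b * (r - 1) * (c - 1)


# Hypothetical counts: layer k=0 is [[10, 20], [20, 10]], layer k=1 is all 15.
counts_3way = [[[10, 15], [20, 15]], [[20, 15], [10, 15]]]
chi2_cond, df_cond = chi_square_conditional(counts_3way)
print(round(chi2_cond, 4), df_cond)
```

In this illustrative table only the first layer departs from independence, so the whole statistic comes from that layer.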

Under the hypothesis that $p_{i.k} = p_{i..}\, p_{..k}$ and $p_{.jk} = p_{.j.}\, p_{..k}$ we can test the independence of 'i' and 'k' and of 'j' and 'k'. Also if we let $p_{ijk} = p_{i..}\, p_{.j.}\, p_{..k}$, then we have

$$L \propto \prod_{i,j,k} (p_{i..}\, p_{.j.}\, p_{..k})^{n_{ijk}}. \tag{6.2.1.6}$$

To test the hypothesis we maximize $\log L$ with respect to the $p_{i..}$'s, $p_{.j.}$'s and $p_{..k}$'s subject to

$$\sum_{i=1}^{r} p_{i..} = \sum_{j=1}^{c} p_{.j.} = \sum_{k=1}^{b} p_{..k} = 1$$

and obtain the solutions of maximum likelihood as:

$$\hat{p}_{i..} = \frac{n_{i..}}{n}, \qquad \hat{p}_{.j.} = \frac{n_{.j.}}{n}, \qquad \hat{p}_{..k} = \frac{n_{..k}}{n}. \tag{6.2.1.7}$$

The number of independent parameters estimated from the data is (r + c + b - 3), and hence the $\chi^2$ used to test the hypothesis here will be

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{b} \frac{\bigl(n_{ijk} - \frac{n_{i..}\, n_{.j.}\, n_{..k}}{n^2}\bigr)^2}{\frac{n_{i..}\, n_{.j.}\, n_{..k}}{n^2}} \tag{6.2.1.8}$$

with rcb - 1 - (r + c + b - 3) = rcb - r - c - b + 2 degrees of freedom.
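A short Python sketch of the statistic (6.2.1.8) may help to fix the indices. It assumes the counts are stored as `counts[i][j][k]`; the function name and the example counts are hypothetical and are not data from the report.

```python
def chi_square_mutual(counts):
    """Chi-square for mutual independence of 'i', 'j' and 'k'.

    counts[i][j][k] holds n_ijk. The expected frequency of a cell is
    n_i.. * n_.j. * n_..k / n**2, with rcb - r - c - b + 2 degrees of freedom.
    """
    r, c, b = len(counts), len(counts[0]), len(counts[0][0])
    n_i = [sum(counts[i][j][k] for j in range(c) for k in range(b)) for i in range(r)]
    n_j = [sum(counts[i][j][k] for i in range(r) for k in range(b)) for j in range(c)]
    n_k = [sum(counts[i][j][k] for i in range(r) for j in range(c)) for k in range(b)]
    n = sum(n_i)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            for k in range(b):
                expected = n_i[i] * n_j[j] * n_k[k] / n ** 2
                chi2 += (counts[i][j][k] - expected) ** 2 / expected
    return chi2, r * c * b - r - c - b + 2


# Hypothetical 2 x 2 x 2 counts (illustrative only); every margin equals 60.
counts_mutual = [[[10, 15], [20, 15]], [[20, 15], [10, 15]]]
chi2_mut, df_mut = chi_square_mutual(counts_mutual)
print(round(chi2_mut, 4), df_mut)
```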

In order to test the hypothesis of the independence between '(i, j)' and 'k', or

$$H_0: p_{ijk} = p_{ij.}\, p_{..k}$$

against $H_a \neq H_0$ ($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, c$; $k = 1, 2, \ldots, b$), we have

$$L \propto \prod_{i,j,k} (p_{ij.}\, p_{..k})^{n_{ijk}}. \tag{6.2.1.9}$$

To test this hypothesis we maximize $\log L$ with respect to the $p_{ij.}$'s and $p_{..k}$'s subject to $\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij.} = \sum_{k=1}^{b} p_{..k} = 1$ and obtain the maximum likelihood solutions as

$$\hat{p}_{ij.} = \frac{n_{ij.}}{n}, \qquad \hat{p}_{..k} = \frac{n_{..k}}{n}. \tag{6.2.1.10}$$

The number of independent parameters estimated from the data is (rc - 1) + (b - 1) and hence the $\chi^2$ used to test the hypothesis here will be

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{b} \frac{\bigl(n_{ijk} - \frac{n_{ij.}\, n_{..k}}{n}\bigr)^2}{\frac{n_{ij.}\, n_{..k}}{n}} \tag{6.2.1.11}$$

with rcb - 1 - [(rc - 1) + (b - 1)] = (rc - 1)(b - 1) degrees of freedom.

The hypothesis of independence between 'i' and 'k' and between 'j' and 'k' is included under the hypothesis $p_{ijk} = p_{ij.}\, p_{..k}$, which implies $p_{i.k} = p_{i..}\, p_{..k}$ and $p_{.jk} = p_{.j.}\, p_{..k}$; that is, the independence between '(i, j)' and 'k' implies the independence between 'i' and 'k', and between 'j' and 'k', as has been shown by Roy & Kastenbaum (1956).

6.2.2   'i' and 'j' are 'Variates' and 'k' is a 'Way of Classification'.

Suppose there are $b$ sets of sizes $n_{..1}, \ldots, n_{..b}$ of independent observations such that $n_{..k}$ ($k = 1, \ldots, b$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell, and $\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ijk} = p_{..k} = 1$. The likelihood function is given by

$$L = \prod_{k=1}^{b} \Biggl[ \frac{n_{..k}!}{\prod_{i,j} n_{ijk}!} \prod_{i,j} p_{ijk}^{n_{ijk}} \Biggr]. \tag{6.2.2.1}$$

Under the hypothesis of independence between 'i' and 'j' for each 'k', that is

$$H_0: p_{ijk} = p_{i.k}\, p_{.jk}$$

against $H_a \neq H_0$ ($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, c$; $k = 1, 2, \ldots, b$) we have

$$L \propto \prod_{i,j,k} (p_{i.k}\, p_{.jk})^{n_{ijk}}. \tag{6.2.2.2}$$

We maximize $\log L$ with respect to the $p_{i.k}$'s and $p_{.jk}$'s subject to $\sum_{i=1}^{r} p_{i.k} = \sum_{j=1}^{c} p_{.jk} = p_{..k} = 1$, and obtain the maximum likelihood solutions

$$\hat{p}_{i.k} = \frac{n_{i.k}}{n_{..k}}, \qquad \hat{p}_{.jk} = \frac{n_{.jk}}{n_{..k}}.$$

The number of independent parameters estimated from the data is b(r - 1) + b(c - 1) and hence the $\chi^2$ used to test the hypothesis here is

$$\chi^2 = \sum_{k=1}^{b} \Biggl[ \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\bigl(n_{ijk} - \frac{n_{i.k}\, n_{.jk}}{n_{..k}}\bigr)^2}{\frac{n_{i.k}\, n_{.jk}}{n_{..k}}} \Biggr] \tag{6.2.2.3}$$

with b(rc - 1) - b(r - 1) - b(c - 1) = b(r - 1)(c - 1) degrees of freedom.

For the hypothesis that $p_{ijk}$ is independent of 'k', or

$$H_0: p_{ijk} = q_{ij.} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k), we have

$$L \propto \prod_{i,j,k} q_{ij.}^{n_{ijk}}. \tag{6.2.2.4}$$

We maximize $\log L$ with respect to the $q_{ij.}$'s subject to $\sum_{i,j} q_{ij.} = 1$, and obtain the maximum likelihood solutions:

$$\hat{q}_{ij.} = \frac{n_{ij.}}{n}. \tag{6.2.2.5}$$

The number of independent parameters to be estimated from the data is (rc - 1) and hence the statistic $\chi^2$ is

$$\chi^2 = \sum_{k=1}^{b} \Biggl[ \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\bigl(n_{ijk} - n_{..k} \frac{n_{ij.}}{n}\bigr)^2}{n_{..k} \frac{n_{ij.}}{n}} \Biggr] \tag{6.2.2.6}$$

with b(rc - 1) - (rc - 1) = (rc - 1)(b - 1) degrees of freedom.

6.2.3   'i' is a 'Variate' and 'j' and 'k' are 'Ways of Classification'.


Consider $c \times b$ independent sets of sizes $n_{.jk}$ of independent observations, such that $n_{.jk}$ ($j = 1, 2, \ldots, c$; $k = 1, 2, \ldots, b$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell, and $\sum_{i=1}^{r} p_{ijk} = p_{.jk} = 1$. The likelihood function is

$$L = \prod_{j,k} \Biggl[ \frac{n_{.jk}!}{\prod_{i} n_{ijk}!} \prod_{i} p_{ijk}^{n_{ijk}} \Biggr]. \tag{6.2.3.1}$$

Under the hypothesis that, for any 'k', $p_{ijk}$ is independent of 'j', that is

$$H_0: p_{ijk} = q_{i.k} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k), and $\sum_{i=1}^{r} q_{i.k} = \sum_{i=1}^{r} p_{ijk} = p_{.jk} = 1$, we have

$$L \propto \prod_{i,j,k} q_{i.k}^{n_{ijk}}. \tag{6.2.3.2}$$

We maximize $\log L$ with respect to the $q_{i.k}$'s subject to $\sum_{i=1}^{r} q_{i.k} = 1$, and obtain the maximum likelihood solutions as

$$\hat{q}_{i.k} = \frac{n_{i.k}}{n_{..k}}. \tag{6.2.3.3}$$

The number of independent parameters to be estimated from the data is b(r - 1) and hence the statistic $\chi^2$ used to test the hypothesis is

$$\chi^2 = \sum_{j=1}^{c} \sum_{k=1}^{b} \Biggl[ \sum_{i=1}^{r} \frac{\bigl(n_{ijk} - n_{.jk} \frac{n_{i.k}}{n_{..k}}\bigr)^2}{n_{.jk} \frac{n_{i.k}}{n_{..k}}} \Biggr] \tag{6.2.3.4}$$

with cb(r - 1) - b(r - 1) = b(r - 1)(c - 1) degrees of freedom.

Again for the hypothesis that, for any 'j', $p_{ijk}$ is independent of 'k', that is

$$H_0: p_{ijk} = q_{ij.} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k), where $\sum_{i=1}^{r} q_{ij.} = \sum_{i=1}^{r} p_{ijk} = p_{.jk} = 1$, the test is obtained in the same manner.

As mentioned above, the hypotheses

$$H_0: p_{ijk} = q_{i.k} \text{ (say)}$$

together with

$$H_0: p_{ijk} = q_{ij.} \text{ (say)}$$

imply that $p_{ijk}$ is a pure function of 'i', i.e. that $p_{ijk} = q_{i..}$ (say) (for all i, j and k).

If, in a one-way classification in the usual analysis of variance, 'i' corresponds to the 'variate', 'j' to the 'concomitant variate' and 'k' to the 'way of classification', then it will be seen on a little reflection that

$$H_0: p_{ijk} = p_{i.k}\, p_{.jk}$$

against $H_a \neq H_0$ ($i = 1, \ldots, r$; $j = 1, \ldots, c$; $k = 1, \ldots, b$) will be the analogue of the hypothesis of no regression, and

$$H_0: p_{ijk} = q_{ij.} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k) will be the analogue of the hypothesis of no covariance.

On the other hand, suppose we take 'j' and 'k' as just the two-way classification; for example, we take 'j' as, say, blocks and 'k' as, say, treatments in a randomized complete block experiment (with more than one and in general unequal number of replications in each cell). Then

$$H_0: p_{ijk} = q_{i.k} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k) will be the analogue of no block effect for each treatment separately, and

$$H_0: p_{ijk} = q_{ij.} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k) will be the analogue of 'no treatment effect' for each block separately.

In other words, in the usual parlance of analysis of variance,

$$H_0: p_{ijk} = q_{i.k} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k) combines the hypotheses of 'no main effect' and 'no interaction', while

$$H_0: p_{ijk} = q_{ij.} \text{ (say)}$$

against $H_a \neq H_0$ (for all i, j and k) combines the hypotheses of another 'no main effect' and 'no interaction'.

7.  Normal Score Transformation.

R. A. Fisher (1943) designed a normal score transformation for ranked data. The characteristics of various objects often cannot be measured numerically, but they can be ranked in an orderly sequence, as in judging ice cream, bread, cake, candy, chocolate, all food tests, tea and coffee tests, and furthermore tests for clothing, sports, cars, courses, etc. We may not express our preference in a quantitative measure, but we can rank the different flavors as 1, 2, 3 and so on. For this ranked data we can replace each rank by a normal score which can be found in the Statistical Tables for Biological, Agricultural and Medical Research of Fisher and Yates (1943). This table gives the average deviate of the $r$th largest of samples of $n$ observations drawn from a normal distribution which has a unit variance; that is, if $X_{(1)} \geq X_{(2)} \geq \ldots \geq X_{(n)}$ is an ordered sample from a standard normal distribution, the table gives $E(X_{(r)})$.

The application of this table is very simple. We now consider an example of the ranked and randomized complete block design. Four flavors of ice cream were evaluated by 10 judges. Each judge ranked the flavors 1, 2, 3, or 4, with 1 being the most preferred, and with the results in the following Table 7.1.

Table 7.1
Ranks for Testing the Flavors of Four Ice Creams


                 Flavor
Judge      A      B      C      D
1          2      1      4      3
2          1      2      3      4
3          2      1      4      3
4          3      2      4      1
5          2      1      4      3
6          2      3      4      1
7          1      2      3      4
8          2      1      4      3
9          2      1      4      3
10         3      1      2      4

After we transform the ranks in the table into normal scores we may have the new two-way Table 7.2.

Table 7.2
Normal Score Transformed Data from Table 7.1

                        Flavor
Judge       A         B         C         D        Total
1          0.30      1.03     -1.03     -0.30        0
2          1.03      0.30     -0.30     -1.03        0
3          0.30      1.03     -1.03     -0.30        0
4         -0.30      0.30     -1.03      1.03        0
5          0.30      1.03     -1.03     -0.30        0
6          0.30     -0.30     -1.03      1.03        0
7          1.03      0.30     -0.30     -1.03        0
8          0.30      1.03     -1.03     -0.30        0
9          0.30      1.03     -1.03     -0.30        0
10        -0.30      1.03      0.30     -1.03        0
Total      3.260     6.780    -7.510    -2.530       0


For analysis we can consider the judges as blocks and flavors as treatments in a randomized complete block design and then do the conventional analysis of variance. The results obtained are shown in the following Table 7.3.

Table 7.3
Analysis of Variance Table for Testing the Flavors of Four Ice Creams

Source of Variation     DF     Sum of Squares     Mean Square     F-Value
Treatment                3        11.9397            3.9799        9.6990
Error                   27        11.0783            0.4103
Total                   30        23.0179
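The arithmetic behind Tables 7.2 and 7.3 can be retraced with a short Python sketch. It assumes the two-decimal normal scores shown in Table 7.2 (1.03, 0.30, -0.30, -1.03 for ranks 1 to 4), so sums of squares may differ from the tabled ones in the last decimal place; the variable names are illustrative only.

```python
# Normal scores for ranks 1..4 in samples of size 4, as they appear in Table 7.2.
SCORES = {1: 1.03, 2: 0.30, 3: -0.30, 4: -1.03}

# Ranks from Table 7.1: one row per judge, columns are flavors A, B, C, D.
ranks = [
    [2, 1, 4, 3], [1, 2, 3, 4], [2, 1, 4, 3], [3, 2, 4, 1], [2, 1, 4, 3],
    [2, 3, 4, 1], [1, 2, 3, 4], [2, 1, 4, 3], [2, 1, 4, 3], [3, 1, 2, 4],
]

scores = [[SCORES[rank] for rank in row] for row in ranks]
b, t = len(scores), len(scores[0])            # 10 blocks (judges), 4 treatments

flavor_totals = [sum(row[j] for row in scores) for j in range(t)]

# Block totals and the grand total are zero, so no correction term is needed.
total_ss = sum(x * x for row in scores for x in row)
treat_ss = sum(total * total for total in flavor_totals) / b
error_ss = total_ss - treat_ss

df_treat = t - 1
df_error = (b * t - 1) - (b - 1) - df_treat   # block df removed from the total
f_value = (treat_ss / df_treat) / (error_ss / df_error)

print([round(x, 2) for x in flavor_totals], round(treat_ss, 4), round(f_value, 2))
```

With these rounded scores the sketch gives the flavor totals of Table 7.2, a treatment sum of squares of 11.9397 and an F-value of about 9.70 on 3 and 27 degrees of freedom.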

And also if we use a 5% significance level, the multiple range test results are as follows.

Treatment       Mean
C             -0.7510
D             -0.2530
A              0.3260
B              0.6780

Here we should note that since the block totals are zero, we are not able to find differences among blocks. The block degrees of freedom should be subtracted from that of the total. The normal score transformation may apply not only to ranked data but also to quantitative data, and the second numerical example shows the analysis of variance for the normal score transformed randomized complete block data. The data include 5 treatments and 10 blocks. The transformed scores and the results of the analysis of variance are shown in the following Tables 7.5 and 7.6.


Table 7.4
Two-Way Table of Randomized Complete Block Design

                     Treatment
Block      1      2      3      4      5
1         46     50     69     48     44
2         48     46     47     60     40
3         32     50     46     54     59
4         4?     48     65     47     44
5         39     37     49     50     55
6         48     58     59     68     50
7         49     50     42     58     47
8         30     44     63     46     71
9         48     40     47     46     43
10        34     39     47     37     55

Table 7.5
Normal Score Transformed Data from Table 7.4

                          Treatment
Block        1         2         3         4         5
1          -0.50      0.50      1.16      0.00     -1.16
2           0.50     -0.50      0.00      1.16     -1.16
3          -1.16      0.00     -0.50      0.50      1.16
4          -1.16      0.50      1.16      0.00     -0.50
5          -0.50     -1.16      0.00      0.50      1.16
6          -1.16      0.00      0.50      1.16     -0.50
7           0.00      0.50     -1.16      1.16     -0.50
8          -1.16     -0.50      0.50      0.00      1.16
9           1.16     -1.16      0.50      0.00     -0.50
10         -1.16      0.00      0.50     -0.50      1.16
Total      -5.14     -1.82      2.66      3.98      0.32


Table 7.6
Analysis of Variance Table for the Data in Table 7.5

Source of Variation     DF     Sum of Squares     Mean Square     F-Value
Treatment                4         5.27504           1.3188         1.78
Error                   36        26.63696           0.7399
Total                   40        31.91200

Also, the grand total is equal to zero and all the block totals are equal to zero, so the component of blocks is completely eliminated. The total sum of squares is just $\sum_{i=1}^{t} \sum_{j=1}^{b} y_{ij}^2$. Also the number of degrees of freedom for the total sum of squares is reduced, because the component of blocks is eliminated.

In using the normal score transformation, ties are permitted. If two ranks or observations in the same block are identical, the average of the corresponding normal scores is used.

Furthermore, for the randomized complete block design, this transformation can be extended to factorial experiments with two or more factors. In this case, each of the treatments can be divided into several levels. Then the experiment becomes the factorial type. After the transformation is made for these kinds of experiment as above, the conventional analysis of variance or even regression can also be used.

For food test experiments, because it is not easy to rank more than 4 products effectively at a time, this method is limited. Fisher's normal score table can be applied for up to 50 treatments.


8.  [...] Binomial Population.

For convenience, we describe the relation between the $\chi^2$-test and the F-test in the completely randomized design case, when examining binomial populations, as in the test for equal medians.

As we know, the means of samples of size $n$ drawn from an ordinary binomial population with $p$ and $q$, which are not necessarily equal, follow approximately the normal distribution with the population mean equal to $p$ and variance equal to $p(1 - p)/n$. Then the sample means, the $\bar{y}_i$'s, may be considered a sample of $t$ (number of treatments) observations drawn from a normal population with mean equal to $p$ and variance equal to $p(1 - p)/n$. From this and by definition, the $\chi^2$-statistic is given by

$$\chi^2 = \frac{\sum_{i=1}^{t} (\bar{y}_i - \bar{\bar{y}})^2}{p(1 - p)/n} = \frac{n \sum_{i=1}^{t} (\bar{y}_i - \bar{\bar{y}})^2}{p(1 - p)} = \frac{\text{Among sample SS}}{p(1 - p)} \tag{8.1}$$

where $\bar{\bar{y}}$ is the mean of the $\bar{y}_i$. This $\chi^2$ will follow approximately the $\chi^2$ distribution with (t - 1) degrees of freedom. Since the variance of a binomial population is equal to $p(1 - p)$, the $\bar{\bar{y}}(1 - \bar{\bar{y}})$ may be used as the pooled estimate of $p(1 - p)$, and then

$$\chi^2 = \frac{\sum_{i=1}^{t} n (\bar{y}_i - \bar{\bar{y}})^2}{\bar{\bar{y}}(1 - \bar{\bar{y}})} = \frac{\text{Among sample SS}}{\bar{\bar{y}}(1 - \bar{\bar{y}})} \tag{8.2}$$

is distributed approximately as a chi-square random variable with (t - 1) degrees of freedom. Also, if $p$ is specified, then we can use $pq$ to estimate $p(1 - p)$.

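For the F-statistic we commonly use

$$F = \frac{\text{Among sample SS} / (t - 1)}{\text{Within sample SS} / (\sum n - t)} = \frac{\text{Among sample MS}}{\text{Within sample MS}} \tag{8.3}$$

with (t - 1) and ($\sum n$ - t) degrees of freedom. This means that the two statistics $\chi^2$ and F are similar, because the $\chi^2$ may be expressed in a form that resembles an F statistic,

$$F' = \frac{\chi^2}{t - 1} = \frac{\text{Among sample SS} / (t - 1)}{\bar{\bar{y}}(1 - \bar{\bar{y}})} = \frac{\text{Among sample MS}}{\bar{\bar{y}}(1 - \bar{\bar{y}})} \tag{8.4}$$

with t - 1 and $\infty$ degrees of freedom.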

Notice that in this case the within sample mean square is replaced by $\bar{\bar{y}}(1 - \bar{\bar{y}})$. This is the difference between the normal and binomial population cases. For a normal population, $\sigma^2$ is directly estimated by $s^2$, the error mean square, and for a binomial population $\sigma^2 = p(1 - p)$. So in a basic sense, these two tests, $\chi^2$ and F', are similar.

Also, we can consider the term $\bar{\bar{y}}(1 - \bar{\bar{y}})$ as the total mean square, because in a binomial population the observations $y$ are all 0's and 1's, the grand total is $\sum y = \sum y^2 = G$ (say), the total SS is $G - G^2 / \sum n$, and the total mean square is approximately equal to

$$\frac{G - \frac{G^2}{\sum n}}{\sum n} = \bar{\bar{y}} - \bar{\bar{y}}^2 = \bar{\bar{y}}(1 - \bar{\bar{y}}) \tag{8.5}$$

where we have just replaced the total degrees of freedom $\sum n - 1$ by $\sum n$. So, consequently, the total mean square is only slightly greater than $\bar{\bar{y}}(1 - \bar{\bar{y}})$. Furthermore, the total mean square is the weighted average of the among sample and within sample mean squares, with their numbers of degrees of freedom being the weights.

In our case, we used the pooled median as a cutting point to transform the data into the binomial form, and to test the hypothesis that the t treatment populations have the same median, that is, p = 1 - p = q = 0.5. Under this case we may replace the term $\bar{\bar{y}}(1 - \bar{\bar{y}})$ by p(1 - p) = pq = 1/4.

From the discussion above we see the $\chi^2$-test is equivalent to the analysis of variance, if we use the total mean square as the error term. That is to say, the $\chi^2$-test and the analysis of variance usually yield the same conclusion in testing the hypothesis that t population means are equal.
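The relation in (8.2) can be checked numerically. The sketch below assumes t samples of equal size n from 0/1 data and compares the among-sample form of $\chi^2$ with the usual $2 \times t$ contingency-table $\chi^2$ computed from expected frequencies. The function names and the success counts are hypothetical and serve only as an illustration.

```python
def chi2_from_means(successes, n):
    """Among-sample SS divided by ybar(1 - ybar), as in formula (8.2)."""
    t = len(successes)
    sample_means = [s / n for s in successes]
    grand_mean = sum(successes) / (t * n)
    among_ss = n * sum((m - grand_mean) ** 2 for m in sample_means)
    return among_ss / (grand_mean * (1 - grand_mean))


def chi2_contingency_2xt(successes, n):
    """Ordinary chi-square of the 2 x t table (above / not above the cut)."""
    t = len(successes)
    total = t * n
    total_success = sum(successes)
    chi2 = 0.0
    for s in successes:
        for observed, margin in ((s, total_success), (n - s, total - total_success)):
            expected = n * margin / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical data: 3 samples of 16 observations, counts above the pooled median.
above = [12, 8, 4]
chi2_means = chi2_from_means(above, 16)
chi2_table = chi2_contingency_2xt(above, 16)
print(chi2_means, chi2_table)
```

Both forms give the same value for these counts, each with t - 1 = 2 degrees of freedom.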

9.   Comments  and  Discussion. 

9.1  Basic  Technique. 

The basic technique of the non-parametric methods in this report is the contingency table based on a pooled median. If the dimensions of the table are 2 x 2, the data may be interpreted as two samples drawn from two binomial populations. If the dimensions are 2 x t (or 2 x c), the data may be interpreted as t samples drawn from t binomial populations, or also as 2 samples drawn from t-attribute multinomial populations. If the dimensions are t x b (or r x c), the data may be interpreted either as r random samples drawn from c-attribute multinomial populations or as c random samples drawn from r-attribute multinomial populations. Either interpretation may yield the same result.

9.2  Application of Mean or Median.

Since the normal population is symmetric, the mean and the median are equal. So all of the discussion concerning the mean also pertains to the median. The test of the hypothesis that the t population means are equal is the same test as for t population medians being equal. For the binomial population in this report all the discussion about tests of hypotheses is about the median instead of the mean. The median has an important property; that is, the median is transformable. For example, for the 5 observations 14, 15, 26, 100, 125, the median is 26 and the mean is 56. Suppose we use the square root transformation; then the corresponding transformed values are 3.74, 3.87, 5.10, 10.00, 11.18, where the transformed median is 5.10, which is the square root of the original median 26, but the mean is 6.78, which is no longer the square root of the original mean 56. For any monotone transformation this is true, so when we use a transformation with the analysis of variance, we are actually making comparisons among the medians on the original scale. In this report, for cases in which the population is not normal, the mean and median may not be the same, so the median is used directly for the transformation.
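The numerical example above is easy to retrace with the Python standard library. The sketch below uses the five observations from the text; the variable names are illustrative only.

```python
import math
import statistics

data = [14, 15, 26, 100, 125]          # observations from the example
roots = [math.sqrt(x) for x in data]   # square root transformation

# The median commutes with the (monotone) square root transformation.
median_commutes = math.isclose(
    statistics.median(roots), math.sqrt(statistics.median(data))
)

# The mean does not: the mean of the roots differs from the root of the mean.
mean_of_roots = statistics.mean(roots)
root_of_mean = math.sqrt(statistics.mean(data))

print(median_commutes, round(mean_of_roots, 2), round(root_of_mean, 2))
```

The mean of the transformed values is about 6.78, whereas the square root of the original mean 56 is about 7.48.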


In section 8, we see that the $\chi^2$-test is similar to the F'-test. The F-value tends to be larger than F', but the corresponding F-value in the table is also larger than that for F', because the degrees of freedom of the denominator of F' is larger. The $\chi^2$ test seems to have a slightly higher probability of committing a Type II error than has the analysis of variance. However, the F-test is also not beyond reproach, because the populations are binomial and not normal. If the population is not normal, the analysis of variance tends to reject the true hypothesis more frequently than the significance level specified. Therefore, the F-test seems to have a higher probability of committing a Type I error than that of the $\chi^2$-test.

9.3  Individual  Degree  of  Freedom. 

The individual degree of freedom can be used on any contingency table except that of 2 x 2, in which case the number of degrees of freedom is already equal to 1. The basic technique of the individual degree of freedom is to reduce the dimension of the contingency table to 2 x 2 out of the r x c contingency table. The purpose of the individual degree of freedom is to increase the power of the test.

9.4  Sheffield's Comments.

Sheffield (1957) reinterpreted Wilson's method in a similar manner. He considered that the hypothesis in Wilson's method is that each observation in a cell has a 50% chance of falling above the pooled median. If n is the number of observations per cell, then the range of the possible frequencies above the median is from 0 to n, and the mean is equal to n/2. The variance of a frequency is npq or n(0.5)(0.5) = n/4, since the hypothesis is that p = q = 0.5. He repeated the example with the 3 x 3 factorial experiment including 16 replicates in each cell. The range of observation in each cell is from 0 to 16. The mean of each cell is 16/2 = 8, and the variance of a cell is 16/4 = 4. The obtained frequency table is


                  Table 9.4.1
The Fictitious 3x3 Factorial Experimental Data

                 Illumination
Dials        1       2       3     Total
A           14      12      11       37
B            9       7       8       24
C            6       3       2       11
Total       29      22      21       72

and  the  analysis  of  variance  is  as  follows. 

                            Table 9.4.2
      Analysis of Variance Table for the Data in Table 9.4.1

Source of                                             Wilson's
Variation       DF       SS       MS       F       P       χ²         P
Dials            2   112.67    56.34   14.08   <0.01   28.168   < 0.15*
Illumination     2    12.67     6.34    1.58   >0.05    3.138       102
Interaction      4     2.67     0.67    0.17       -    0.664
Total            8   128.00    16.00    4.00   <0.01
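
The sums of squares in Table 9.4.2 can be recomputed from the cell frequencies with a short Python sketch. Two points are assumptions rather than statements from Sheffield: the cell values 14 and 24 are the ones implied by the marginal totals of Table 9.4.1, and the relations F = MS/npq and Wilson's χ² = SS/npq are read off the tabled values (for example 112.67/4 is about 28.17). Small differences in the last digits are to be expected from rounding.

```python
# Cell frequencies above the pooled median, as implied by the marginal
# totals of Table 9.4.1 (rows: dials A, B, C; columns: illumination 1, 2, 3).
counts = [[14, 12, 11],
          [9, 7, 8],
          [6, 3, 2]]
n = 16                   # observations per cell
npq = n * 0.5 * 0.5      # theoretical variance of a cell frequency, p = q = 0.5

r, c = len(counts), len(counts[0])
grand_total = sum(sum(row) for row in counts)
correction = grand_total ** 2 / (r * c)

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]

ss_total = sum(x * x for row in counts for x in row) - correction
ss_dials = sum(t * t for t in row_totals) / c - correction
ss_illum = sum(t * t for t in col_totals) / r - correction
ss_inter = ss_total - ss_dials - ss_illum

# Assumed relations: F = MS / npq and Wilson's chi-square = SS / npq.
lines = [("Dials", r - 1, ss_dials),
         ("Illumination", c - 1, ss_illum),
         ("Interaction", (r - 1) * (c - 1), ss_inter),
         ("Total", r * c - 1, ss_total)]
for source, df, ss in lines:
    ms = ss / df
    print(f"{source:<13}{df:>3}{ss:>9.2f}{ms:>8.2f}{ms / npq:>8.2f}{ss / npq:>9.3f}")
```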

He noted that the value for illumination is not at all significant
by the nonparametric test but would be well within the 5% level if tested
in the conventional way, and he also mentioned that in a typical 3x3
factorial experiment with only one observation per cell, there is
no within cell error because of the lack of replications. The only error
component available in such a case is the interaction of the two marginal
effects.
If the parametric approach or F-test is applied, the F-value
for illumination against interaction is 6.34/0.67 = 9.5, which is well
beyond the 6.94 needed at the 5% level for 2 and 4 degrees of freedom.
The corresponding nonparametric test (F = 1.58) does not even reach the
20% level of confidence.
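
The two F-values contrasted above differ only in the error term, which a brief Python sketch makes explicit. The sums of squares and the critical value 6.94 are taken from the text and Table 9.4.2. The variable names are illustrative, and dividing by npq = 4 for the nonparametric F is an assumption inferred from the tabled values.

```python
# Sums of squares and degrees of freedom as printed in Table 9.4.2.
ss_illumination, df_illumination = 12.67, 2
ss_interaction, df_interaction = 2.67, 4
npq = 16 * 0.5 * 0.5          # theoretical cell variance under p = q = 0.5

ms_illumination = ss_illumination / df_illumination
ms_interaction = ss_interaction / df_interaction

# Parametric test: illumination against the interaction mean square.
f_parametric = ms_illumination / ms_interaction
# Nonparametric test (assumed form): illumination against npq.
f_nonparametric = ms_illumination / npq

critical_5pct = 6.94          # tabled F for 2 and 4 degrees of freedom (from the text)
print(round(f_parametric, 1), f_parametric > critical_5pct)
print(round(f_nonparametric, 2))
```

The interaction mean square (0.67) is far below the theoretical variance of 4, which is why the same illumination mean square looks significant against one error term and not against the other.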

Sheffield concluded with the comment that Wilson's test involves two
parts: first, the procedure for creating approximately normal data from
the original nonnormal data by cutting at a pooled median; second, the
procedure for testing the obtained variance against npq. Only the second
part of the method is the distribution-free part.

9.5  McNemar's Comments.

McNemar (1957) contrasted the results of Wilson's test and the F-test
for some data of two-way classification which are published in other
textbooks. For row effects, column effects, and interaction effects alike,
most of the comparisons showed that the probability level reached by the
F-test is smaller than that reached by Wilson's test, so the power of
Wilson's test is much lower than that of the F-test.




ACKNOWLEDGEMENT

I am indebted to Dr. W. J. Conover for his advice and
assistance in the preparation of this report.



REFERENCES 

Anderson, R. L. and Bancroft, T. A. (1952). Statistical Theory in
     Research. McGraw-Hill Book Co., Inc. New York.
Bradley, J. V. (1960). Distribution-Free Statistical Tests, WADD
     Technical Report, 57-79.
Chapman, D. G. and Meng, R. C. (1966). The Power of Chi-square Tests
     for Contingency Tables. J. of the Amer. Stat. Assoc. 61 965-975.
Chew, V. (1952). Experimental Designs in Industry. John Wiley &
     Sons, Inc. New York.
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton
     Univ. Press.
Fisher, R. A. and Yates, F. (1943). Statistical Tables for Biological,
     Agricultural and Medical Research. Oliver and Boyd Ltd. London.
Fisher, R. A. (1950). Contributions to Mathematical Statistics. John
     Wiley & Sons, Inc. New York.
Fisher, R. A. (1947). The Design of Experiments. Oliver & Boyd, London.
Goodman, L. A. (1964). Simple Methods for Analyzing Three-Factor
     Interaction in Contingency Tables. J. of the Amer. Stat. Assoc. 59
     319-353.
Hoel, P. G. (1954). Introduction to Mathematical Statistics. John Wiley
     & Sons, Inc. New York.
Kempthorne, O. (1952). The Design and Analysis of Experiments. John
     Wiley & Sons, Inc. New York.
Li, J. C. R. (1964). Statistical Inference I. Edwards Brothers, Inc.
     Michigan.
Mather, K. Statistical Analysis in Biology. Methuen and Co.
     Ltd., London.


NONPARAMETRIC STATISTICAL METHODS FOR
THE RANDOMIZED COMPLETE BLOCK DESIGN

LI-CHUN  TAO 
B. S., Taiwan Provincial Chung-hsing University, 195**


AN ABSTRACT OF A MASTER'S REPORT
submitted  in  partial  fulfillment  of  the 
requirements  for  the  degree 

MASTER OF SCIENCE

Department  of  Statistics  and 
Computer  Science 


KANSAS  STATE  UNIVERSITY 
Manhattan, Kansas


1966 




The purpose of this report is to introduce the application of the
χ²-test as a nonparametric test on the randomized complete block
design. This test is one of the large sample methods, so before we can
make use of this method, we should have large samples. The minimum sample
size can be obtained from the working rule given in Section 2.

In order to use the chi-square test for the randomized complete block
design, we first of all need to change the randomized complete block two
way table into a two way contingency table. In other words, we have to
transform the continuous data into discrete multinomial data with the median
as a cutting point. A multinomial data set is a set of observations which
can be classified into r categories. If r = 2 the multinomial data become
binomial data. The method for this transformation is called the binomial
transformation and is stated in the second section.
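
As a rough illustration of the transformation just described, the Python sketch below cuts a small randomized complete block layout at its pooled median and counts, for each treatment, the observations above and not above it. The data and the function name `binomial_transformation` are hypothetical, and counting observations equal to the median as "not above" is an assumption made here, not a rule taken from this report.

```python
from statistics import median


def binomial_transformation(data):
    """data maps each treatment to its observations (one per block).
    Returns the pooled median and, per treatment, the pair
    (number above the pooled median, number not above it)."""
    pooled = median(x for observations in data.values() for x in observations)
    table = {}
    for treatment, observations in data.items():
        above = sum(1 for x in observations if x > pooled)
        table[treatment] = (above, len(observations) - above)
    return pooled, table


# Hypothetical layout: 3 treatments observed in 4 blocks.
data = {"T1": [12, 15, 11, 14],
        "T2": [9, 10, 13, 8],
        "T3": [7, 6, 16, 5]}

pooled, table = binomial_transformation(data)
print(pooled)
print(table)
```

The resulting two-column table of counts is the two way contingency table on which the chi-square computations are then carried out.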

In the third section we stated that the test of independence between
two attributes in the χ²-test is comparable to the test of interaction
between two attributes in the analysis of variance case.

The fourth and fifth sections deal with the methods to compute various
χ²'s concerned with different types of experimental data to test the
hypotheses that the treatment population means are the same, in which, of
course, the contingency table should be formed at first. In the discussion
we started with data having one observation per cell and then treated data
with more observations per cell. An extension of the methods applies to
factorial experiments on the randomized complete block design, in which
both no combination and combinations among levels of factors are discussed.
The various χ²'s are computed to test hypotheses about the significance
of the different main effects and interaction effects.

The sixth section contains the concepts of the expected frequencies
of two and three way classification. The method of the derivation of the
expected frequencies used is the maximum likelihood method.

In the seventh section appears a normal score transformation. This
was introduced by Fisher and Yates (1943) and is used for the analysis of
ranked data. If we first transform the quantitative data into ranks,
the numerical data can also be analyzed by this method. After the normal
score transformation has been made, all the methods used in normal
populations can be used on the ranked data.
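
A minimal Python sketch of a rank-to-normal-score conversion follows. It does not use the Fisher and Yates tables: as a stand-in it uses Blom's approximation to the expected normal order statistics, score = Φ⁻¹((rank - 3/8)/(n + 1/4)), so the scores differ slightly from the tabled ones. The function name `normal_scores` and the sample values are hypothetical, and tied observations are not handled.

```python
from statistics import NormalDist


def normal_scores(values):
    """Replace each value by an approximate normal score for its rank.
    Uses Blom's approximation, not the tabled Fisher-Yates scores.
    Assumes there are no tied values."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    standard_normal = NormalDist()
    for rank, index in enumerate(order, start=1):
        scores[index] = standard_normal.inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


# Hypothetical observations; the smallest gets the most negative score.
values = [3.1, 1.2, 2.5]
scores = normal_scores(values)
print([round(s, 3) for s in scores])
```

Once the scores are in hand they can be treated like ordinary measurements, for example in an analysis of variance.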

The last two sections compared the χ²-test and the F-test, and the
situations of using mean and median. The F-test is better for normal
populations and the χ²-test needs larger samples to have the same power as
the F-test. Since the normal distribution is symmetrical, the mean and
median are tested in normally distributed data while only the median is
compared in binomially distributed data.


